Offload leaf search work to AWS Lambda functions#6157

Merged
fulmicoton merged 7 commits into main from lambda3
Feb 19, 2026
Conversation

@fulmicoton-dd
Collaborator

@fulmicoton-dd fulmicoton-dd commented Feb 12, 2026

Summary

The goal is to handle traffic spikes gracefully without provisioning additional searcher nodes: when the local search queue is saturated, overflow splits are transparently routed to Lambda for processing.

How offloading works

The offloading decision happens on the leaf side, inside the SearchPermitProvider. The permit provider already manages a bounded queue of pending split search tasks (gated by memory budget and download slots). When a leaf search request arrives, the provider checks the current queue depth against a configurable offload_threshold. If granting permits for all requested splits would exceed this threshold, only enough splits to fill up to the threshold are processed locally — the rest are marked for offloading.
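
The threshold rule can be sketched as follows. This is a simplified illustration, not the actual `SearchPermitProvider` code: the function name `split_local_and_offloaded` and its parameters are hypothetical, and the sketch ignores the memory budget and download slots.

```rust
/// Decides how many of the requested splits run locally and how many are
/// offloaded. Hypothetical helper, not part of quickwit-search.
///
/// `pending` is the current queue depth, `num_requested` the number of splits
/// in the incoming leaf request, and `offload_threshold` the configured limit.
/// Returns `(num_local, num_offloaded)`.
fn split_local_and_offloaded(
    pending: usize,
    num_requested: usize,
    offload_threshold: usize,
) -> (usize, usize) {
    // Room left in the local queue before the threshold is reached.
    // `saturating_sub` yields 0 when the queue is already past the threshold.
    let local_capacity = offload_threshold.saturating_sub(pending);
    let num_local = num_requested.min(local_capacity);
    (num_local, num_requested - num_local)
}
```

With a threshold of 0 the local capacity is always 0, which matches the "0 = always offload" behavior described in the configuration.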

The offloaded splits are batched (up to max_splits_per_invocation splits per batch, balanced by document count) and sent to Lambda in parallel. Each Lambda invocation runs the same leaf search code path and returns per-split results individually. This is important: the per-split responses are fed back into the IncrementalCollector and populate the partial result cache, so subsequent queries hitting the same splits benefit from cached results regardless of whether the split was searched locally or on Lambda.
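
One way to form batches that respect `max_splits_per_invocation` while roughly balancing document counts is a greedy "largest first" assignment. The sketch below is an assumption about how such balancing could work, not the code in `leaf.rs`; the `Split` struct and `batch_splits` function are hypothetical.

```rust
/// Minimal stand-in for a split descriptor (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Split {
    split_id: String,
    num_docs: u64,
}

/// Groups splits into the smallest number of batches that respects
/// `max_splits_per_invocation`, then balances batches by document count.
fn batch_splits(mut splits: Vec<Split>, max_splits_per_invocation: usize) -> Vec<Vec<Split>> {
    assert!(max_splits_per_invocation > 0);
    if splits.is_empty() {
        return Vec::new();
    }
    // Ceiling division: fewest batches that can hold every split.
    let num_batches =
        (splits.len() + max_splits_per_invocation - 1) / max_splits_per_invocation;
    let mut batches: Vec<Vec<Split>> = vec![Vec::new(); num_batches];
    let mut batch_docs: Vec<u64> = vec![0; num_batches];
    // Place the largest splits first so that small ones can even things out.
    splits.sort_by(|a, b| b.num_docs.cmp(&a.num_docs));
    for split in splits {
        // Pick the lightest batch that still has a free slot.
        let target = (0..num_batches)
            .filter(|&i| batches[i].len() < max_splits_per_invocation)
            .min_by_key(|&i| batch_docs[i])
            .expect("num_batches * max_splits_per_invocation >= number of splits");
        batch_docs[target] += split.num_docs;
        batches[target].push(split);
    }
    batches
}
```

Each batch would then map to one Lambda invocation, and the invocations can be issued in parallel.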

Auto-deployment

Depending on the configuration, the Lambda function code can be deployed automatically at startup. The quickwit-lambda-client crate embeds a compressed Lambda binary at compile time. When auto_deploy is configured, Quickwit will:

  1. Check if a published Lambda version matching the current binary already exists (identified by a description tag quickwit:{version}-{hash})
  2. Create or update the function and publish a new version if needed
  3. Garbage-collect old versions (keeping the current one + 5 most recent)

This ensures the Lambda function always matches the running Quickwit version without any external deployment tooling. Manual deployment is also supported for users who prefer to manage Lambda functions through Terraform or other IaC tools.
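
The version matching and garbage-collection steps can be illustrated without any AWS calls. The sketch below assumes that published versions carry a numeric version number and a description string; `PublishedVersion`, `description_tag`, `find_matching_version`, and `versions_to_delete` are hypothetical names, and the real tag may carry more information than the version and hash.

```rust
/// A published Lambda version, reduced to the two fields used here
/// (hypothetical shape).
struct PublishedVersion {
    number: u64,
    description: String,
}

/// Builds the description tag in the `quickwit:{version}-{hash}` format.
fn description_tag(version: &str, hash: &str) -> String {
    format!("quickwit:{version}-{hash}")
}

/// Step 1: look for an already published version whose description matches.
fn find_matching_version(published: &[PublishedVersion], tag: &str) -> Option<u64> {
    published
        .iter()
        .find(|v| v.description == tag)
        .map(|v| v.number)
}

/// Step 3: keep the current version plus the `keep_recent` most recent other
/// versions, and return the version numbers that can be deleted.
fn versions_to_delete(
    published: &[PublishedVersion],
    current: u64,
    keep_recent: usize,
) -> Vec<u64> {
    let mut others: Vec<u64> = published
        .iter()
        .map(|v| v.number)
        .filter(|&n| n != current)
        .collect();
    // Most recent (highest number) first.
    others.sort_unstable_by(|a, b| b.cmp(a));
    others.into_iter().skip(keep_recent).collect()
}
```

For example, with versions 1 through 8 published and version 3 current, `versions_to_delete(&published, 3, 5)` returns versions 2 and 1.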

Configuration

Lambda offloading is opt-in. Add a lambda section under searcher in the node configuration:

searcher:
  lambda:
    offload_threshold: 100     # queue depth before offloading kicks in (0 = always offload)
    max_splits_per_invocation: 10
    auto_deploy:
      execution_role_arn: arn:aws:iam::123456789012:role/quickwit-lambda-role
      memory_size: 5 GiB
      invocation_timeout_secs: 15

New crates

  • quickwit-lambda-client: Handles Lambda invocation (with metrics) and auto-deployment logic. Embeds the Lambda binary at build time.
  • quickwit-lambda-server: The Lambda function handler itself — receives a LeafSearchRequest, runs multi_index_leaf_search, and returns per-split LeafSearchResponses.

Key changes in existing crates

  • quickwit-search: New LambdaLeafSearchInvoker trait; SearchPermitProvider gains get_permits_with_offload to split work between local and offloaded; leaf.rs orchestrates local and Lambda tasks in parallel.
  • quickwit-config: New LambdaConfig and LambdaDeployConfig structs under SearcherConfig.
  • quickwit-serve: Initializes the Lambda invoker at startup when configured.
  • quickwit-proto: New LeafSearchResponses wrapper message for batched per-split responses.

@fulmicoton fulmicoton requested a review from Copilot February 12, 2026 12:51
@fulmicoton-dd fulmicoton-dd force-pushed the lambda3 branch 2 times, most recently from 63a7c92 to 0441e5f Compare February 12, 2026 12:58
@fulmicoton fulmicoton changed the title This PR introduces the ability to offload leaf search work to AWS Lam… Offload leaf search work to AWS Lambda functions Feb 12, 2026
Copilot AI left a comment

Pull request overview

Adds an opt-in AWS Lambda “overflow” execution path for leaf split search to handle traffic spikes without adding more searcher nodes, including auto-deploy of the Lambda binary and per-split result integration into the existing partial result cache / incremental merge flow.

Changes:

  • Introduces quickwit-lambda-client (invocation + auto-deploy + metrics) and quickwit-lambda-server (Lambda handler running Quickwit leaf search).
  • Extends searcher configuration/context to support Lambda offloading, and updates leaf-search scheduling to split work between local permits and Lambda batches.
  • Adds protobuf support for batched per-split responses plus docs and CI workflow for publishing the Lambda binary.

Reviewed changes

Copilot reviewed 43 out of 45 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| `quickwit/quickwit-storage/src/cache/memory_sized_cache.rs` | Adds a regression test for CacheConfig::no_cache() behavior. |
| `quickwit/quickwit-serve/src/lib.rs` | Initializes Lambda invoker on startup when searcher+lambda are configured. |
| `quickwit/quickwit-serve/Cargo.toml` | Adds dependency on quickwit-lambda-client. |
| `quickwit/quickwit-search/src/tests.rs` | Updates tests for SearcherContext::new(..., lambda_invoker) signature changes. |
| `quickwit/quickwit-search/src/service.rs` | Extends SearcherContext to carry an optional Lambda invoker. |
| `quickwit/quickwit-search/src/search_permit_provider.rs` | Adds offload-aware permit acquisition logic (threshold-based truncation). |
| `quickwit/quickwit-search/src/root.rs` | Minor tracing import/use adjustment. |
| `quickwit/quickwit-search/src/list_terms.rs` | Adjusts permit sizing collection (prep for offload-aware behavior). |
| `quickwit/quickwit-search/src/lib.rs` | Exposes new invoker module and re-exports LambdaLeafSearchInvoker. |
| `quickwit/quickwit-search/src/leaf_cache.rs` | Minor whitespace cleanup. |
| `quickwit/quickwit-search/src/leaf.rs` | Implements local-vs-Lambda scheduling, batching, parallel execution, and merge integration. |
| `quickwit/quickwit-search/src/invoker.rs` | Introduces LambdaLeafSearchInvoker trait abstraction. |
| `quickwit/quickwit-proto/src/error.rs` | Updates error header doc comment wording. |
| `quickwit/quickwit-proto/src/codegen/quickwit/quickwit.search.rs` | Adds LeafSearchResponses wrapper message to generated code. |
| `quickwit/quickwit-proto/protos/quickwit/search.proto` | Adds LeafSearchResponses proto definition. |
| `quickwit/quickwit-lambda/README.md` | Removes old deprecation stub text. |
| `quickwit/quickwit-lambda-server/src/lib.rs` | Defines Lambda server crate exports/modules. |
| `quickwit/quickwit-lambda-server/src/handler.rs` | Implements Lambda handler: decode request, run per-split searches, encode responses. |
| `quickwit/quickwit-lambda-server/src/error.rs` | Adds Lambda error types and conversions. |
| `quickwit/quickwit-lambda-server/src/context.rs` | Builds Lambda-optimized SearcherConfig from env and sets caches to no_cache. |
| `quickwit/quickwit-lambda-server/src/config.rs` | Adds a (currently empty) config stub file. |
| `quickwit/quickwit-lambda-server/src/bin/leaf_search.rs` | Provides Lambda binary entrypoint using lambda_runtime. |
| `quickwit/quickwit-lambda-server/Cargo.toml` | Adds Lambda server crate definition and dependencies/features. |
| `quickwit/quickwit-lambda-client/src/metrics.rs` | Adds Prometheus metrics for Lambda invocation and payload sizes. |
| `quickwit/quickwit-lambda-client/src/lib.rs` | Exposes deploy/invoker APIs and payload types. |
| `quickwit/quickwit-lambda-client/src/invoker.rs` | Implements AWS Lambda invocation + response decoding into per-split responses. |
| `quickwit/quickwit-lambda-client/src/deploy.rs` | Implements auto-deploy logic (version discovery/publish + GC). |
| `quickwit/quickwit-lambda-client/build.rs` | Downloads and embeds Lambda zip; computes content hash for versioning. |
| `quickwit/quickwit-lambda-client/README.md` | Documents Lambda release process and content-based versioning. |
| `quickwit/quickwit-lambda-client/Cargo.toml` | Adds Lambda client crate definition and dependencies/build deps. |
| `quickwit/quickwit-config/src/node_config/serialize.rs` | Extends serialization tests to cover lambda config. |
| `quickwit/quickwit-config/src/node_config/mod.rs` | Adds LambdaConfig, LambdaDeployConfig, SearcherConfig.lambda, and CacheConfig::no_cache(). |
| `quickwit/quickwit-config/src/lib.rs` | Re-exports lambda config types. |
| `quickwit/quickwit-config/resources/tests/node_config/quickwit.yaml` | Adds lambda section to test YAML config. |
| `quickwit/quickwit-config/resources/tests/node_config/quickwit.toml` | Adds lambda section to test TOML config. |
| `quickwit/quickwit-config/resources/tests/node_config/quickwit.json` | Adds lambda section to test JSON config. |
| `quickwit/quickwit-config/Cargo.toml` | Adds quickwit-common testsuite feature dep for config tests. |
| `quickwit/quickwit-aws/src/lib.rs` | Bumps AWS SDK behavior version used in defaults. |
| `quickwit/Cargo.toml` | Adds new crates to workspace members + workspace deps (lambda_runtime, ureq, zip, aws-sdk-lambda, aws-smithy-mocks). |
| `quickwit/Cargo.lock` | Locks new dependencies (aws-sdk-lambda, lambda_runtime, ureq, zip, etc.). |
| `docs/configuration/lambda-config.md` | Adds end-user documentation for lambda offloading + IAM + deployment. |
| `LICENSE-3rdparty.csv` | Updates third-party license list for new deps. |
| `.github/workflows/publish_lambda.yaml` | Adds workflow to build and draft-release the Lambda binary zip. |
| `.github/actions/cross-build-binary/action.yml` | Pins upload-to-github-release action by commit SHA. |
| `.github/actions/cargo-build-macos-binary/action.yml` | Pins upload-to-github-release action by commit SHA. |

Comments suppressed due to low confidence (1)

quickwit/quickwit-lambda-server/src/config.rs:17

  • src/config.rs appears to be an unused stub (not referenced via mod config; and contains only imports). If it’s not needed, it should be removed; if it is intended to host config parsing, wire it up and add the missing implementation.
use anyhow::Context as _;
use bytesize::ByteSize;


@fulmicoton-dd fulmicoton-dd force-pushed the lambda3 branch 2 times, most recently from e212982 to 94b0a4d Compare February 12, 2026 13:48
@fulmicoton-dd fulmicoton-dd marked this pull request as ready for review February 12, 2026 13:48
build fix
Contributor

@trinity-1686a trinity-1686a left a comment

review still in progress

Contributor

@trinity-1686a trinity-1686a left a comment

I think we should somehow mark this feature as experimental for the time being (maybe warn on startup when it's enabled?). I suspect there are changes we'll want to make if/when we add support for other FaaS providers, in particular to what the config looks like.

Comment on lines -466 to -473
    if let Some(cached_answer) = ctx
        .searcher_context
        .leaf_search_cache
        .get(split.clone(), search_request.clone())
    {
        leaf_search_state_guard.set_state(SplitSearchState::CacheHit);
        return Ok(Some(cached_answer));
    }
Contributor

i think this could stay: it could happen that we get a cache miss at the top level, wait for a bit while acquiring a permit, and now the value is there

Collaborator

ok I readded it with a comment explaining the redundancy
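
The redundancy discussed in this thread can be illustrated with a small standalone sketch. It is not the Quickwit code: `LeafCache`, `search_split`, and the closures standing in for permit acquisition and the actual search are all hypothetical.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Toy cache mapping a split id to a hit count (hypothetical stand-in for
/// the leaf search cache).
struct LeafCache {
    entries: Mutex<HashMap<String, u64>>,
}

impl LeafCache {
    fn get(&self, split_id: &str) -> Option<u64> {
        self.entries.lock().unwrap().get(split_id).copied()
    }

    fn put(&self, split_id: &str, num_hits: u64) {
        self.entries.lock().unwrap().insert(split_id.to_string(), num_hits);
    }
}

/// Returns `(num_hits, served_from_cache)`.
fn search_split(
    cache: &LeafCache,
    split_id: &str,
    wait_for_permit: impl FnOnce(),
    run_search: impl FnOnce() -> u64,
) -> (u64, bool) {
    // First lookup, before queueing for a permit.
    if let Some(hit) = cache.get(split_id) {
        return (hit, true);
    }
    // Waiting for a permit can take a while under load.
    wait_for_permit();
    // Second lookup: another request may have populated the cache meanwhile.
    if let Some(hit) = cache.get(split_id) {
        return (hit, true);
    }
    let num_hits = run_search();
    cache.put(split_id, num_hits);
    (num_hits, false)
}
```

The second lookup costs one extra cache probe and avoids a full split search whenever a concurrent request finished first.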

@fulmicoton
Collaborator

Refactoring needed:
A Lambda invocation should be able to have a single split fail without throwing away the rest of its work. Results shouldn't need to be ordered. The results should be a list of results named by split.

Need to investigate if the reuse of the leaf result protobuf is really mandatory here.

Are we sure merging can happen on the tokio loop?

@trinity-1686a
Contributor

we're not.

Lambda now reports (but does not retry) which splits failed and which succeeded.
The leaf cache stores individual split results returned by Lambda (keeping track of the rewritten request to do so).
The Lambda handler returns results named by split ID, in any order.
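
A minimal sketch of how a caller might consume such results, assuming each entry names its split and carries either a result or an error. The types and the `partition_results` function are hypothetical and far simpler than the actual protobuf messages.

```rust
use std::collections::HashMap;

/// One per-split outcome as returned by a Lambda invocation (hypothetical).
struct SplitResult {
    split_id: String,
    /// `Ok(num_hits)` on success, `Err(message)` on failure.
    outcome: Result<u64, String>,
}

/// Separates successful results from splits that failed or are missing from
/// the response. The order of `results` does not matter because every entry
/// is looked up by split id.
fn partition_results(
    requested: &[&str],
    results: Vec<SplitResult>,
) -> (HashMap<String, u64>, Vec<String>) {
    let mut successes = HashMap::new();
    for result in results {
        if let Ok(num_hits) = result.outcome {
            successes.insert(result.split_id, num_hits);
        }
    }
    // Anything requested but not successful is reported as failed, in
    // request order. This sketch does not retry.
    let failed = requested
        .iter()
        .filter(|id| !successes.contains_key(**id))
        .map(|id| id.to_string())
        .collect();
    (successes, failed)
}
```

Successful entries can be cached and merged individually, so one failing split does not discard the work done for the others.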
@trinity-1686a trinity-1686a self-requested a review February 19, 2026 09:22
You can deploy the Lambda function manually without `auto_deploy`:
1. Download the Lambda zip from [GitHub releases](https://github.com/quickwit-oss/quickwit/releases)
2. Create or update the Lambda function using AWS CLI, Terraform, or the AWS Console
3. Publish a version with description format `quickwit:{version}-{sha1}` (e.g., `quickwit:0_8_0-fa752891`)
Contributor

is that still accurate now that description contains some configuration params?

Comment on lines +72 to +73
/// We also pass the deploy config, as we want the function to be redeployed
/// if the deployed is changed.
Contributor

Suggested change
/// We also pass the deploy config, as we want the function to be redeployed
/// if the deployed is changed.
/// We also pass the deploy config, as we want the function to be redeployed
/// if the config is changed.

@fulmicoton fulmicoton enabled auto-merge (squash) February 19, 2026 12:22
@fulmicoton fulmicoton merged commit 8412e6d into main Feb 19, 2026
8 checks passed
@fulmicoton fulmicoton deleted the lambda3 branch February 19, 2026 12:31
4 participants